Abstract

Green [Geometric and Functional Analysis 15 (2005), 340-376] established
a version of the Szemerédi Regularity Lemma for abelian groups and derived
the Removal Lemma for abelian groups as its corollary. We provide another
proof of his Removal Lemma that allows us to extend its statement to all
finite groups. We also discuss possible extensions of the Removal Lemma to
systems of equations.

1 Introduction 

A consequence of the celebrated Szemerédi Regularity Lemma for graphs
is the so-called Removal Lemma. In its simplest formulation, the lemma
states that a graph with $o(n^3)$ triangles can be transformed into a
triangle-free graph by removing only $o(n^2)$ edges. Green [2] has recently
established a version of the Szemerédi Regularity Lemma for abelian groups
and derived, as a consequence of his result, the Removal Lemma for abelian
groups:



Theorem 1 (Green [2], Theorem 1.5). Let $G$ be a finite abelian group of
order $N$. Let $m \ge 3$ be an integer, and suppose that $A_1, \dots, A_m$
are subsets of $G$ such that there are $o(N^{m-1})$ solutions to the equation
$a_1 + \dots + a_m = 0$ with $a_i \in A_i$ for all $i$. Then it is possible
to remove $o(N)$ elements from each set $A_i$ so as to obtain sets $A'_i$
such that there is no solution of the equation $a'_1 + \dots + a'_m = 0$
with $a'_i \in A'_i$ for all $i$.

Rigorously speaking, Theorem 1 asserts that for every $\delta > 0$ there
exists $\delta'(\delta, m)$ such that, if the equation $a_1 + \dots + a_m = 0$
has fewer than $\delta N^{m-1}$ solutions with $a_i \in A_i$, then there are
subsets $A'_i \subseteq A_i$ with $|A_i \setminus A'_i| \le \delta' N$ such
that the equation $a_1 + \dots + a_m = 0$ has no solution with
$a_i \in A'_i$, and the value of $\delta'$ tends to $0$ as $\delta \to 0$.
Let us emphasize that the value of $\delta'$ does not depend on the order
$N$ of the group $G$ (nor on its structure).

The proof of the Regularity Lemma for abelian groups in [2] relies heavily
on Fourier analysis techniques, and thus the result is restricted to
abelian groups. In this paper we provide a proof of Theorem 1 building
on combinatorial methods, and thus we are able to generalize the result to
arbitrary finite groups. In particular, we prove the following extension of
Theorem 1:

Theorem 2. Let $G$ be a finite group of order $N$. Let $A_1, \dots, A_m$,
$m \ge 2$, be sets of elements of $G$ and let $g$ be an arbitrary element
of $G$. If the equation $x_1 x_2 \cdots x_m = g$ has $o(N^{m-1})$ solutions
with $x_i \in A_i$, then there are subsets $A'_i \subseteq A_i$ with
$|A_i \setminus A'_i| = o(N)$ such that there is no solution of the equation
$x_1 x_2 \cdots x_m = g$ with $x_i \in A'_i$.

We use the multiplicative notation in Theorem 2 to emphasize that the
group $G$ need not be abelian.

Our technique also allows us to extend Theorem 1 to equation systems of
a certain type. Let $G$ be an abelian group (with additive notation) and
consider an equation system of the following type:

$$
\begin{aligned}
\varepsilon_{11} x_1 + \dots + \varepsilon_{1m} x_m &= 0 \\
&\ \,\vdots \\
\varepsilon_{k1} x_1 + \dots + \varepsilon_{km} x_m &= 0
\end{aligned}
\tag{1}
$$

where $\varepsilon_{ij} \in \{-1, 0, 1\}$, $k \ge 1$ and $m \ge 2$. The vector
$(\varepsilon_{i1}, \dots, \varepsilon_{im})$ is referred to as the
characteristic vector of the $i$-th equation. We say that the system (1) is
graph representable by a directed graph $H$ with $m$ arcs (one for each
variable in the system) if the characteristic vectors of cycles in $H$ are
precisely the integer linear combinations of the characteristic vectors of
the equations; see Section 3 for details. With this notation, our second
result is the following:



Theorem 3. Let $G$ be a finite abelian group of order $N$. Let
$A_1, \dots, A_m$, $m \ge 2$, be sets of elements of $G$. If the equation
system (1) is graph-representable and has $o(N^{m-k})$ solutions with
$x_i \in A_i$, then there are subsets $A'_i \subseteq A_i$ with
$|A_i \setminus A'_i| = o(N)$ such that there is no solution of the system
(1) with $x_i \in A'_i$.

Theorem 3 can also be extended to non-abelian groups at the expense of
strengthening the notion of graph representability. With this strong
version, which is explained in Section 3, the same technique allows us to
prove:

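Theorem 4. Let $G$ be a finite group of order $N$ written multiplicatively.
Let $A_1, \dots, A_m$, $m \ge 2$, be sets of elements of $G$. Consider the
equation system

$$
\begin{aligned}
x_{\sigma_1(1)}^{\varepsilon_{1\sigma_1(1)}} \cdots
x_{\sigma_1(m)}^{\varepsilon_{1\sigma_1(m)}} &= 1 \\
&\ \,\vdots \\
x_{\sigma_k(1)}^{\varepsilon_{k\sigma_k(1)}} \cdots
x_{\sigma_k(m)}^{\varepsilon_{k\sigma_k(m)}} &= 1
\end{aligned}
\tag{2}
$$

where $\sigma_1, \dots, \sigma_k$ are permutations of $[1, m]$,
$\varepsilon_{ij} \in \{-1, 0, 1\}$, $k \ge 1$, $m \ge 2$. If the system (2)
is strongly graph-representable and has $o(N^{m-k})$ solutions with
$x_i \in A_i$, then there are subsets $A'_i \subseteq A_i$ with
$|A_i \setminus A'_i| = o(N)$ such that there is no solution of the system
(2) with $x_i \in A'_i$.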

Before proceeding further, let us present a corollary of Theorem 4 which
illustrates possible applications of our results. Let $G$ be a finite group
and $A, B \subseteq G$. The representation function
$r_{A,B} : G \to \mathbb{N}$, defined as
$r_{A,B}(g) = |\{(a, b) \in A \times B : ab = g\}|$, counts the number of
representations of an element $g \in G$ as a product of an element in $A$
and one in $B$. We write $r_A$ for $r_{A,A}$.
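As a small illustration, the representation function can be computed by brute force. The sketch below assumes the symmetric group $S_3$, encoded as permutation tuples; the names `mul`, `rep`, `A` and `B` are hypothetical and chosen for this example only.

```python
from itertools import permutations

# Assumption for this example: G is S_3, elements are tuples p with p[i] = image of i.
G = list(permutations(range(3)))

def mul(p, q):
    """Product pq in S_3: apply p first, then q."""
    return tuple(q[i] for i in p)

def rep(A, B):
    """Return the dictionary g -> r_{A,B}(g) = #{(a, b) in A x B : ab = g}."""
    counts = {g: 0 for g in G}
    for a in A:
        for b in B:
            counts[mul(a, b)] += 1
    return counts

A = G[:3]          # an arbitrary subset of G (illustrative choice)
B = G[2:]          # another arbitrary subset
r_AB = rep(A, B)
r_GG = rep(G, G)   # every element has exactly |G| representations in G x G
```

Summing $r_{A,B}(g)$ over all $g$ recovers $|A| \cdot |B|$, which is a quick sanity check on any such implementation.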

Corollary 5. Let $G$ be a finite group of order $N$ and let
$A, B, C, D, E \subseteq G$. If

$$\frac{1}{N} \sum_{g \in E} r_{A,B}(g)\, r_{C,D}(g) = o(N^2),$$

then it is possible to eliminate $o(N)$ elements in each of the sets to
obtain sets $A', B', C', D', E'$ such that

$$\sum_{g \in E'} r_{A',B'}(g)\, r_{C',D'}(g) = 0.$$

In particular,

1. If $\frac{1}{N} \sum_{g \in E} r_A^2(g) = o(N^2)$, then
$(A')^2 \cap E' = \emptyset$ ($A'$ is $E'$-product-free).

2. If $\frac{1}{N} \sum_{g \in G \setminus A} r_A^2(g) = o(N^2)$, then
$|(A')^2| = |A'| + o(N)$ ($A'$ has small doubling).

3. If $\frac{1}{N} \sum_{g \in G} r_{A,B}(g)\, r_{B,A}(g) = o(N^2)$, then
$|A'B' \cap B'A'| = o(N)$ (almost all pairs do not commute).

Proof. Consider the following equation system:

$$
\begin{aligned}
x_1 x_2 x_4^{-1} x_3^{-1} &= 1 \\
x_1 x_2 x_5^{-1} &= 1
\end{aligned}
\tag{3}
$$

The system (3) is strongly representable by the graph $H$ depicted in
Figure 1.

Figure 1: A directed graph representing the equation system (3).

The number of solutions of (3) with $x_1 \in A$, $x_2 \in B$, $x_3 \in C$,
$x_4 \in D$ and $x_5 \in E$ is $\sum_{g \in E} r_{A,B}(g)\, r_{C,D}(g)$.
Hence, if it holds that

$$\frac{1}{N} \sum_{g \in E} r_{A,B}(g)\, r_{C,D}(g) = o(N^2),$$

then there are $o(N^3)$ solutions of the system (3). By Theorem 4 applied
with $m = 5$, $k = 2$, $A_1 = A$, $A_2 = B$, $A_3 = C$, $A_4 = D$ and
$A_5 = E$, it is possible to remove $o(N)$ elements from each of the sets
$A, \dots, E$, obtaining sets $A', \dots, E'$ such that the system (3) has
no solution with $x_1 \in A', \dots, x_5 \in E'$.
Applying the above argument with $A = B = C = D$, we obtain that
$\sum_{g \in E'} r_{A'}(g) = 0$, which is equivalent to
$(A')^2 \cap E' = \emptyset$. This proves 1. Setting $E = G \setminus A$,
we get $\sum_{g \in E'} r_{A'}(g) = 0$. Since
$(A')^2 \subseteq A \cup (E \setminus E')$, $|A \setminus A'| = o(N)$ and
$|E \setminus E'| = o(N)$, we obtain 2. Similarly, 3 is derived by applying
the Corollary with $C = B$, $D = A$ and $E = G$. □
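The counting identity used in this proof can be checked numerically on a small group. The following sketch again assumes $S_3$ with hypothetical helper names (`mul`, `inv`, `rep`, `count_solutions`) and arbitrary illustrative subsets; it compares a brute-force count of the solutions of the two-equation system with the sum $\sum_{g \in E} r_{A,B}(g)\, r_{C,D}(g)$.

```python
from itertools import permutations, product

# Assumption for this example: G is S_3 encoded as permutation tuples.
G = list(permutations(range(3)))
IDENTITY = tuple(range(3))

def mul(p, q):
    """Product pq: apply p first, then q."""
    return tuple(q[i] for i in p)

def inv(p):
    """Inverse permutation of p."""
    r = [0] * len(p)
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def rep(X, Y):
    """g -> number of pairs (x, y) in X x Y with xy = g."""
    counts = {g: 0 for g in G}
    for x, y in product(X, Y):
        counts[mul(x, y)] += 1
    return counts

def count_solutions(A, B, C, D, E):
    """Count solutions of x1 x2 x4^-1 x3^-1 = 1 and x1 x2 x5^-1 = 1."""
    count = 0
    for x1, x2, x3, x4, x5 in product(A, B, C, D, E):
        first = mul(mul(mul(x1, x2), inv(x4)), inv(x3))
        second = mul(mul(x1, x2), inv(x5))
        if first == IDENTITY and second == IDENTITY:
            count += 1
    return count

# Arbitrary illustrative subsets of G.
A, B, C, D, E = G[:4], G[1:5], G[2:], G[:3], G[::2]
lhs = count_solutions(A, B, C, D, E)
r_AB, r_CD = rep(A, B), rep(C, D)
rhs = sum(r_AB[g] * r_CD[g] for g in E)
```

The two quantities `lhs` and `rhs` agree for every choice of subsets, since a solution is determined by $g = x_5$, a representation $g = x_1 x_2$ and a representation $g = x_3 x_4$.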

2 Removal Lemma for groups 

In our arguments, the following consequence of a variant of the Szemerédi
Regularity Lemma for directed graphs becomes useful:



Lemma 6 (Alon and Shapira [1], Lemma 4.1). Let $H$ be a fixed directed
graph of order $h$. If a directed graph $G$ of order $n$ contains $o(n^h)$
copies of $H$, then there exists a set $E$ of at most $o(n^2)$ arcs of $G$
such that the graph obtained from $G$ by removing the arcs of $E$ is
$H$-free.

The proofs of Theorems 2, 3 and 4 consist in constructing a blow-up
graph of a small graph $H$ (which is a cycle in the case of Theorem 2) such
that any solution of the equations gives rise to $N$ copies of $H$ and every
copy of $H$ comes in fact from a solution of the equations. We then apply
the removal lemma for graphs and, by a pigeonhole principle, reduce the
$o(N^2)$ arcs from Lemma 6 to the $o(N)$ elements stated in Theorem 2.

Proof of Theorem 2. Fix $\delta_0 > 0$ and $m \ge 2$. Let $G$ be a finite
group of order $N$, let $g$ be an element of $G$ and let $A_1, \dots, A_m$
be sets of elements of $G$.

We define an auxiliary directed graph $H_0$ whose vertex set is
$G \times \{1, \dots, m\}$, i.e., its vertices are pairs formed by an
element of the group $G$ and an integer between $1$ and $m$. There is an arc
in $H_0$ from a vertex $(x, i)$, $1 \le i \le m - 1$, to a vertex
$(y, i + 1)$ if there exists an element $a_i \in A_i$ such that $x a_i = y$.
This arc is labeled by the pair $[a_i, i]$. $H_0$ also contains an arc from
a vertex $(x, m)$ to a vertex $(y, 1)$ if there exists an element
$a_m \in A_m$ such that $x a_m g^{-1} = y$. This arc is labeled by the pair
$[a_m, m]$. Let $N_0 = mN$ denote the order of $H_0$. Note that, for each
element $a_i \in A_i$, $H_0$ contains exactly $N$ arcs labeled with
$[a_i, i]$.
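The construction of the auxiliary graph can be sketched in code for a concrete small case. The example below assumes the group $S_3$ (so $N = 6$), three arbitrarily chosen sets and an arbitrary target element; the function name `build_h0` and the arc encoding `(tail, head, label)` are assumptions of this sketch, not notation from the paper.

```python
from collections import Counter
from itertools import permutations

# Assumption for this example: G is S_3 encoded as permutation tuples, N = 6.
G = list(permutations(range(3)))

def mul(p, q):
    """Product pq: apply p first, then q."""
    return tuple(q[i] for i in p)

def inv(p):
    """Inverse permutation of p."""
    r = [0] * len(p)
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def build_h0(sets, g):
    """Arcs of the auxiliary graph as triples (tail, head, label)."""
    m = len(sets)
    g_inv = inv(g)
    arcs = []
    for i, A_i in enumerate(sets, start=1):
        for x in G:
            for a in A_i:
                y = mul(x, a)
                if i < m:
                    # arc (x, i) -> (x a, i + 1), labeled [a, i]
                    arcs.append(((x, i), (y, i + 1), (a, i)))
                else:
                    # closing arc (x, m) -> (x a g^-1, 1), labeled [a, m]
                    arcs.append(((x, m), (mul(y, g_inv), 1), (a, m)))
    return arcs

sets = [G[:2], G[1:4], G[3:]]   # illustrative A_1, A_2, A_3
g = G[4]                        # illustrative right-hand side
arcs = build_h0(sets, g)
label_counts = Counter(label for _, _, label in arcs)
vertices = {v for tail, head, _ in arcs for v in (tail, head)}
```

In this sketch every label occurs exactly $N = 6$ times, matching the remark at the end of the construction, and the vertex set has $mN = 18$ elements.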

Observe that any directed cycle of $H_0$ with length $m$ gives a solution
of the equation: if $[a_1, 1], [a_2, 2], \dots, [a_m, m]$ are the labels of
the arcs in the cycle and it contains the vertex $(z, 1)$, then
$z a_1 a_2 \cdots a_m g^{-1} = z$ by the definition of $H_0$. Conversely,
each solution $a_1, \dots, a_m$ of (5) corresponds to $N$ edge-disjoint
directed cycles of length $m$ in $H_0$:

$$(z, 1), (z a_1, 2), (z a_1 a_2, 3), \dots, (z a_1 \cdots a_{m-1}, m),
(z a_1 \cdots a_m g^{-1}, 1) = (z, 1), \tag{4}$$

one for each of the $N$ distinct possible choices of $z \in G$.
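The correspondence between directed cycles of length $m$ and solutions can be tested numerically. The sketch below, again assuming $S_3$ and hypothetical names (`successors`, `count_m_cycles`, `count_solutions`), walks $m$ steps from each vertex $(z, 1)$ and counts the walks that return to their starting vertex; because the layers $1, \dots, m$ are visited in order, each such closed walk is a directed cycle of length $m$.

```python
from functools import reduce
from itertools import permutations, product

# Assumption for this example: G is S_3 encoded as permutation tuples, N = 6.
G = list(permutations(range(3)))

def mul(p, q):
    """Product pq: apply p first, then q."""
    return tuple(q[i] for i in p)

def inv(p):
    """Inverse permutation of p."""
    r = [0] * len(p)
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def successors(vertex, sets, g):
    """Heads of all arcs leaving `vertex` in the auxiliary graph."""
    x, i = vertex
    m = len(sets)
    if i < m:
        return [(mul(x, a), i + 1) for a in sets[i - 1]]
    g_inv = inv(g)
    return [(mul(mul(x, a), g_inv), 1) for a in sets[m - 1]]

def count_m_cycles(sets, g):
    """Count directed cycles of length m (each has one vertex in layer 1)."""
    m = len(sets)
    total = 0
    for z in G:
        frontier = [(z, 1)]
        for _ in range(m):
            frontier = [w for v in frontier for w in successors(v, sets, g)]
        total += frontier.count((z, 1))
    return total

def count_solutions(sets, g):
    """Count tuples (a_1, ..., a_m) in A_1 x ... x A_m with a_1 ... a_m = g."""
    return sum(1 for t in product(*sets) if reduce(mul, t) == g)

sets = [G[:2], G[1:4], G[3:]]   # illustrative A_1, A_2, A_3
g = G[4]                        # illustrative right-hand side
cycles = count_m_cycles(sets, g)
solutions = count_solutions(sets, g)
```

For any choice of sets and of $g$, `cycles` equals `len(G) * solutions`, which is the factor $N$ used later when the removal lemma for graphs is applied.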

Suppose, using the hypothesis, that there are fewer than
$\delta_0 N^{m-1}$ solutions of the equation

$$x_1 x_2 \cdots x_m = g \quad \text{with } x_i \in A_i. \tag{5}$$

By the correspondence between the cycles of $H_0$ and the solutions of (5),
the directed graph $H_0$ contains no more than $\delta_0 N^m$ distinct
directed cycles of length $m$.

Apply Lemma 6 to $H_0$ and the directed cycle of length $m$ with
$\delta = \delta_0 / m^m$: since $H_0$ has fewer than
$\delta_0 N^m = \delta N_0^m$ copies of the directed cycle of length $m$,
there is a set $E$ of at most $\delta' N_0^2$ arcs such that $H_0 - E$
contains no directed cycle of length $m$, with some $\delta'$ depending only
on $\delta$ and $m$.

Let $B_i$ be the set of those elements $a \in A_i$ such that $E$ contains
at least $N/m$ arcs labeled with $[a, i]$. Since $|E| \le \delta' N_0^2$,
the size of each $B_i$ is at most $m|E|/N \le \delta' m^3 N$. Set
$A'_i = A_i \setminus B_i$. Since the size of $B_i$ is bounded by
$\delta' m^3 N$, $\delta'$ depends on $\delta$ and $m$ only, and
$\delta' \to 0$ as $\delta_0 \to 0$, the theorem will be proven after we
show that there is no solution of the equation (5) with $a_i \in A'_i$.

Assume that there is a solution of the equation (5) with $a_i \in A'_i$.
Consider the $N$ edge-disjoint directed cycles of length $m$ corresponding
to $a_1, \dots, a_m$ which are given by (4). Each of these $N$ cycles
contains at least one of the arcs of $E$, and the arcs of these $N$
edge-disjoint cycles are labeled only with the pairs
$[a_1, 1], [a_2, 2], \dots, [a_m, m]$. Since these directed cycles are
edge-disjoint, $E$ contains at least $N$ of their arcs, and hence at least
$N/m$ arcs labeled $[a_i, i]$ for some $1 \le i \le m$. Consequently,
$a_i \in B_i$ and thus $a_i \notin A'_i$, a contradiction. We conclude that
there is no solution of (5) with $a_i \in A'_i$. □
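The pigeonhole step that turns a set of removed arcs into sets of removed group elements can be sketched as follows. The function name `heavy_elements` and the toy data are assumptions of this sketch; only the threshold $N/m$ and the bound $m|E|/N$ come from the proof above.

```python
from collections import Counter

def heavy_elements(removed_labels, N, m):
    """Given the labels (a, i) of the arcs in E, return the sets B_1, ..., B_m.

    B_i collects the elements a for which E contains at least N / m arcs
    labeled [a, i].
    """
    counts = Counter(removed_labels)
    B = {i: set() for i in range(1, m + 1)}
    for (a, i), c in counts.items():
        if c * m >= N:          # c >= N / m, written without division
            B[i].add(a)
    return B

# Toy data (hypothetical): N = 6, m = 3, so the threshold is 2 arcs per label.
N, m = 6, 3
E_labels = [("a", 1), ("a", 1), ("b", 1), ("c", 2), ("c", 2), ("c", 2)]
B = heavy_elements(E_labels, N, m)
bound = m * len(E_labels) / N   # each B_i has at most m|E|/N elements
```

Here `B[1]` contains only `"a"` (the label `("b", 1)` occurs once, below the threshold), `B[2]` contains `"c"`, and `B[3]` is empty, so every $B_i$ respects the bound $m|E|/N = 3$.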

3 Extensions to systems of equations 

Let us now recall the notion of cycle spaces of directed graphs. If $H$ is
a directed graph with $m$ arcs, then the cycle space of $H$ is the vector
space over $\mathbb{Q}$ spanned by the characteristic vectors of cycles of
$H$, where the characteristic vector of a cycle $C$ of $H$ is the
$m$-dimensional vector $v$ with each coordinate associated with one of the
arcs such that the $i$-th coordinate of $v$ is $+1$ if the $i$-th arc is
traversed by $C$ in its direction, $-1$ if it is traversed by $C$ in the
reverse direction, and $0$ if the arc is not traversed by $C$.

A set of integer vectors contained in the cycle space is said to integrally
generate the cycle space of $H$ if they are independent and every vector of
the cycle space can be expressed as a linear combination of these vectors
with integer coefficients. It is known [4] that the vectors integrally
generate the cycle space if and only if every maximum square submatrix of
the matrix formed by these vectors has determinant $0$, $+1$ or $-1$. This
turns out to be equivalent to the fact that the determinant of one such
non-singular submatrix is $+1$ or $-1$. Let us now give some examples. If
$T$ is a spanning tree of $H$, then the characteristic vectors of the
fundamental cycles with respect to $T$ always integrally generate the cycle
space of $H$ [4]. On the other hand, an example of a set of characteristic
vectors that generate but do not integrally generate the cycle space of a
graph is given in Figure 2.
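The determinant criterion is easy to evaluate for small systems. The sketch below checks all maximal square submatrices of a list of characteristic vectors; the function names and the test graph (two directed triangles sharing one vertex, with arcs $e_0, e_1, e_2$ and $e_3, e_4, e_5$) are assumptions chosen for illustration and are not taken from the paper's figure.

```python
from itertools import combinations

def det(M):
    """Integer determinant by Laplace expansion (adequate for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )

def maximal_minors(vectors):
    """Determinants of all k x k submatrices of the k x m matrix of vectors."""
    k, m = len(vectors), len(vectors[0])
    return [
        det([[row[c] for c in cols] for row in vectors])
        for cols in combinations(range(m), k)
    ]

def passes_criterion(vectors):
    """True if the vectors are independent and all maximal minors are 0, +1 or -1."""
    minors = maximal_minors(vectors)
    return any(minors) and all(d in (-1, 0, 1) for d in minors)

# Fundamental cycles of the assumed graph: the two triangles themselves.
triangles = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
# Two closed walks spanning the same rational space: both triangles forward,
# and the first triangle forward with the second one reversed.
walks = [[1, 1, 1, 1, 1, 1], [1, 1, 1, -1, -1, -1]]

triangles_ok = passes_criterion(triangles)
walks_ok = passes_criterion(walks)
```

In this example `triangles_ok` is `True` while `walks_ok` is `False`: the second pair has a maximal minor equal to $-2$, reflecting that a single triangle is only half of the sum of the two walks.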

Figure 2: An example of a set of cycles generating but not integrally
generating the cycle space of a directed graph: the cycle
$e_0 e_1 e_2 e_3 e_4 e_5$ together with a second cycle through $e_0$
generates the cycle space of the depicted directed graph, but the two
cycles do not integrally generate it: the cycle $e_0 e_1 e_2$ can only be
written as a rational (not integral) linear combination of the two cycles
in the generating set.

Consider now the equation system (1). The vector
$(\varepsilon_{i1}, \dots, \varepsilon_{im})$ is referred to as the
characteristic vector of the $i$-th equation. The system is said to be
graph-representable if there exists a directed graph $H$ with $m$ arcs,
each associated with one of the variables $x_1, \dots, x_m$, such that the
characteristic vectors of the equations integrally generate the cycle space
of $H$. Such a directed graph $H$ is called a graph representation of the
equation system (1). Note that the condition that the characteristic
vectors of the equations integrally generate the cycle space can be
efficiently tested, since it is equivalent to computing the value of the
determinant of a matrix as explained in the previous paragraph.

The proof of Theorem 3 follows the lines of the one for Theorem 2. In
this case we use the following colored version of Lemma 6.

Lemma 7 (Removal Lemma for arc-colored directed graphs). Let $m$ be a
fixed integer and $H$ a directed graph of order $h$ with its arcs colored
with $m$ colors. If a directed graph $G$ of order $n$ with arcs colored
with $m$ colors contains $o(n^h)$ copies of $H$ (the colors of the arcs in
the copy and in $H$ must be the same), then there exists a set $E$ of at
most $o(n^2)$ arcs such that the graph obtained from $G$ by removing the
arcs contained in $E$ is $H$-free.

Lemma 7 can be proved by combining the proof of Lemma 6 with the
edge-colored version of the Regularity Lemma stated for instance in
[3, Lemma 1.18].

Proof of Theorem 3. Let $H$ be a graph representation of the equation
system (1). We can assume without loss of generality that $H$ is connected.
We view the arc corresponding to the variable $x_i$ as colored with the
color $i$. In this way, the arcs of $H$ are colored with numbers from $1$
to $m$. Since the dimension of the cycle space of $H$ is $k$ (as the
characteristic vectors of the equations from (1) are assumed to be
independent) and $H$ has $m$ arcs, the number of vertices of $H$ is
$h = m - k + 1$.

Next, we construct an auxiliary directed graph $H_0$. The vertex set of
$H_0$ is $G \times V(H)$. For every arc $(u, v)$ of $H$ associated with
$x_i$, the directed graph $H_0$ contains $N|A_i|$ arcs from $(g, u)$ to
$(g + a, v)$, one for each $g \in G$ and each $a \in A_i$. The arc from
$(g, u)$ to $(g + a, v)$ is colored $i$ and labeled by the pair $[a, i]$.
The order of $H_0$ is $N_0 = hN$, its size is $N(|A_1| + \dots + |A_m|)$
and its arcs are colored with numbers $1, \dots, m$. We call $H_0$ the
blow-up graph of $H$ by $A_1, \dots, A_m$.
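A minimal sketch of the blow-up construction for a cyclic group may help fix the definitions. It assumes $G = \mathbb{Z}_n$ and a directed triangle as $H$; the function name `blowup`, the arc encoding and the sample sets are hypothetical choices made for this example.

```python
def blowup(n, H_arcs, sets):
    """Blow-up of H by A_1, ..., A_m over Z_n.

    H_arcs[i - 1] = (u, v) is the arc of H carrying the variable x_i and
    sets[i - 1] is A_i. Each returned arc is (tail, head, color, label).
    """
    arcs = []
    for color, ((u, v), A) in enumerate(zip(H_arcs, sets), start=1):
        for g in range(n):
            for a in A:
                arcs.append(((g, u), ((g + a) % n, v), color, a))
    return arcs

# Assumed example: H is a directed triangle, representing x_1 + x_2 + x_3 = 0.
n = 5
H_arcs = [(0, 1), (1, 2), (2, 0)]
sets = [{1, 2}, {0}, {3, 4}]
arcs = blowup(n, H_arcs, sets)
vertices = {v for tail, head, _, _ in arcs for v in (tail, head)}
```

In this example the blow-up has $hN = 15$ vertices and $N(|A_1| + |A_2| + |A_3|) = 25$ arcs, in line with the order and size stated above.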

Let $H'$ be a subgraph of $H_0$ isomorphic to $H$ (preserving the colors).
The arc of $H'$ colored with $i$ is an arc from a vertex $(g, u)$ to a
vertex $(g + a_i, v)$ for some $a_i \in A_i$. Setting $x_i = a_i$ yields a
solution of the system (1): indeed, if $C$ is a cycle corresponding to the
$j$-th equation, then the cycle $C$ is also present in $H'$ as a cycle
$(g_1, u_1)(g_2, u_2) \cdots (g_l, u_l)$. If $\gamma_t$ is the color of the
arc between $(g_t, u_t)$ and $(g_{t+1}, u_{t+1})$ (indices taken modulo
$l$), then $a_{\gamma_t} = g_{t+1} - g_t$ if the arc is traversed in its
direction, and $a_{\gamma_t} = g_t - g_{t+1}$ otherwise, and thus

$$0 = \sum_{t=1}^{l} (g_{t+1} - g_t)
= \sum_{t=1}^{l} \varepsilon_{j\gamma_t} a_{\gamma_t}
= \sum_{i=1}^{m} \varepsilon_{ji} a_i.$$



Note that we can freely rearrange the summands in the above equation as 
the group G is abelian. 

We have seen that every subgraph of $H_0$ isomorphic to $H$ corresponds
to a solution of the system (1). Let us now show that every solution of
(1) corresponds to $N$ edge-disjoint copies of $H$. Fix a vertex $u_0$ of
$H$, an element $z$ of $G$ and a solution of the system
$a_1 \in A_1, \dots, a_m \in A_m$. Define $\varphi : V(H) \to G$ such that
$\varphi(u_0) = z$ and $\varphi(u') - \varphi(u) = a_i$ for an arc
$(u, u')$ of $H$ corresponding to the variable $x_i$. By the graph
representability of the system, the function $\varphi$ is well defined: if
there are two paths from $u_0$ to a vertex $u$ in $H$, they close a cycle
$C$ which can be expressed as an integral linear combination of the cycles
in the system. Since the $a_i$'s form a solution of the system, the sum of
the labels on the edges along each of the cycles arising from the system is
zero, and therefore this is also the case for $C$. Since $H$ is connected,
the set of vertices $\{(\varphi(u), u) : u \in V(H)\}$ induces a copy of
$H$ in $H_0$. Since there are $N$ choices for $z$, and two different
choices yield edge-disjoint copies of $H$, every solution of the system
with $a_i \in A_i$ gives rise to $N$ edge-disjoint copies of $H$.
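The definition of $\varphi$ amounts to propagating values along the arcs of $H$ and checking consistency. The following sketch does this by breadth-first search over $\mathbb{Z}_n$; the function name `potential`, the triangle used as $H$ and the sample values are assumptions of this example.

```python
from collections import deque

def potential(n, H_arcs, solution, u0, z):
    """Return phi with phi[u0] = z and phi[v] - phi[u] = a_i (mod n) for every
    arc i = (u, v) of H, or None if the values a_i are inconsistent.

    H is assumed to be connected; solution[i - 1] is the value a_i.
    """
    phi = {u0: z % n}
    queue = deque([u0])
    while queue:
        w = queue.popleft()
        for (u, v), a in zip(H_arcs, solution):
            if u == w:
                nxt, val = v, (phi[u] + a) % n      # follow the arc forward
            elif v == w:
                nxt, val = u, (phi[v] - a) % n      # follow the arc backward
            else:
                continue
            if nxt not in phi:
                phi[nxt] = val
                queue.append(nxt)
            elif phi[nxt] != val:
                return None                         # two paths disagree
    return phi

# Assumed example: H is a directed triangle, i.e. x_1 + x_2 + x_3 = 0 in Z_5.
n = 5
H_arcs = [(0, 1), (1, 2), (2, 0)]
good = potential(n, H_arcs, (1, 1, 3), u0=0, z=2)   # 1 + 1 + 3 = 0 mod 5
bad = potential(n, H_arcs, (1, 1, 1), u0=0, z=2)    # 1 + 1 + 1 != 0 mod 5
copies = [potential(n, H_arcs, (1, 1, 3), 0, z) for z in range(n)]
```

For the genuine solution the function returns a consistent assignment for each of the $N = 5$ starting values $z$, giving five distinct copies, while the non-solution is rejected.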

The proof now proceeds as in Theorem 2 except that instead of a cycle
of length $m$ we consider copies of the graph $H$. Fix $\delta_0 > 0$ and
apply Lemma 7 for $\delta = \delta_0 / h^h$, which yields $\delta' > 0$. If
there are fewer than $\delta_0 N^{m-k} = \delta_0 N^{h-1}$ solutions of the
system (1), the directed graph $H_0$ contains at most
$\delta_0 N^h = \delta N_0^h$ distinct copies of $H$. By the choice of
$\delta$, there is a set $E$ of at most $\delta' N_0^2$ arcs such that
$H_0 \setminus E$ has no copy of $H$.

Let $B_i$ be the set of those elements $a \in A_i$ such that $E$ contains
at least $N/m$ arcs $((g, u), (g + a, v))$ colored with $i$. Since
$|E| \le \delta' N_0^2$, the size of each $B_i$ is at most
$m|E|/N \le \delta' m N_0^2 / N = \delta' m h^2 N \le \delta' m^3 N$. Set
$A'_i = A_i \setminus B_i$. Since the size of $B_i$ is bounded by
$\delta' m^3 N$, and $\delta' \to 0$ as $\delta_0 \to 0$, the theorem will
be proven after we show that there is no solution of the system (1) with
$a_i \in A'_i$.

Assume that there is a solution $a'_1, \dots, a'_m$ of the equation system
(1) such that $a'_i \in A'_i$ and consider the $N$ edge-disjoint copies of
$H$ corresponding to this solution. For every $i$, the $N$ copies of $H$
contain together $N$ arcs colored with $i$ that are of the form
$((g, u), (g + a'_i, v))$. Since each copy contains at least one arc of
$E$, there exists an $i_0$ such that $E$ contains at least $N/m$ arcs that
are colored with $i_0$ and are of the form $((g, u), (g + a'_{i_0}, v))$.
Consequently, $a'_{i_0} \in B_{i_0}$ and thus $a'_{i_0} \notin A'_{i_0}$,
which violates the choice of the solution. □

We have already mentioned that, if the characteristic vectors of the
equations of a system correspond to the fundamental cycles of a graph $H$
with respect to one of its spanning trees, then the equation system is
graph-representable. If there exists a representation of this special type,
then we say the system is strongly graph representable. More precisely, the
system (2) is strongly graph representable if there is a directed graph $H$
with $m$ arcs colored by $1, \dots, m$ and a spanning tree $T$ of $H$ such
that the fundamental cycles of $H$ with respect to $T$ are the cycles $C_i$
defined as follows: $C_i$ is the cycle traversing the arcs of $H$ in the
order $e_{i1} \dots e_{im}$, where some of the $e_{ij}$ are "empty", i.e.,
they do not define an arc of $H$. If $\varepsilon_{i\sigma_i(j)} = +1$,
then $e_{ij}$ is the arc colored with $\sigma_i(j)$ traversed in its
direction; if $\varepsilon_{i\sigma_i(j)} = -1$, then $e_{ij}$ is the arc
colored with $\sigma_i(j)$ traversed in the opposite direction; and if
$\varepsilon_{i\sigma_i(j)} = 0$, then $e_{ij}$ is empty. Note that the
condition on the equation system being strongly representable implies that
every equation contains a variable that is not in any of the other
equations. An example can be found in Figure 1, where the graph strongly
represents the equation system arising from Corollary 5.

This stronger condition suffices to extend Theorem 3 to the non-abelian
case.

Proof of Theorem 4. The proof is analogous to the one of Theorem 3. Let
$H$ be a strong representation of the system. In particular, $H$ has $m$
edges and $h = m - k + 1$ vertices. Let $H_0$ be the graph with vertex set
$G \times V(H)$ that contains an arc $((g, u), (ga, v))$ for each $g \in G$,
each arc $(u, v)$ in $H$ that has color $i$ and each $a \in A_i$. Such an
arc also has color $i$ in $H_0$.

Let $H'$ be a subgraph of $H_0$ that is isomorphic to $H$ (preserving the
colors of the edges). If $((g, u), (g', v))$ is an arc of $H'$ colored with
$i$, set $x_i = g^{-1} g'$. Observe that $x_i \in A_i$. We claim that the
$x_i$ form a solution of the equation system. Consider the $i$-th equation
and let $C_i = e_{i1} \dots e_{im}$ be the cycle corresponding to this
equation in $H$ (and thus in $H'$). Let $g_j$, $j = 0, \dots, m$, be the
element of the group $G$ assigned to the vertex shared by the edges
$e_{ij}$ and $e_{i,j+1}$. In particular, $g_0 = g_m$ and if $e_{ij}$ is
empty, then $g_{j-1} = g_j$. We infer from the choice of the $x_i$ the
following:

$$\prod_{j=1}^{m} x_{\sigma_i(j)}^{\varepsilon_{i\sigma_i(j)}}
= \prod_{j=1}^{m} g_{j-1}^{-1} g_j
= g_0^{-1} g_1 g_1^{-1} g_2 \cdots g_{m-1}^{-1} g_m
= g_0^{-1} g_m = 1.$$

Hence, the $x_i$'s are indeed a solution of the equation system.

On the other hand, each solution $x_i \in A_i$ gives rise to $N$
edge-disjoint copies of $H$ in $H_0$. Indeed, let $T$ be a spanning tree of
$H$ such that the cycles $C_1, \dots, C_k$ corresponding to the equations
of the system (2) are fundamental cycles with respect to $T$. Root $T$ at
an arbitrarily chosen vertex $v_0$. Set $g_{v_0}$ to an arbitrary element
of $G$ and define the values $g_v$ for the other vertices of the graph $H$
as follows: if $v'$ is the parent of $v$ in $T$ and the arc $vv'$ has color
$i$ and is oriented from $v$ to $v'$, then $g_v = g_{v'} x_i^{-1}$; if the
arc is oriented from $v'$ to $v$, then $g_v = g_{v'} x_i$.

Let $H'$ be the subgraph of $H_0$ with the vertices $(g_v, v)$ that
contains the arc from $(g_v, v)$ to $(g_{v'}, v')$ with color $k$ for every
arc $vv'$ of $H$ with color $k$. In order to be sure that $H'$ is properly
defined, we have to verify that an arc from $(g_v, v)$ to $(g_{v'}, v')$
with the color $k$ is present in $H_0$. If $vv'$ is an arc of $T$, then
$H_0$ contains the arc from $(g_v, v)$ to $(g_{v'}, v')$ by the definition
of $g_v$. If $vv'$ is not contained in $T$, there is exactly one equation
in the system that contains the variable $x_k$. We infer by a simple
manipulation from the definition of $g_v$ that $g_{v'} = g_v x_k$. Since
$x_k \in A_k$, the arc from $(g_v, v)$ to $(g_{v'}, v')$ is contained in
$H_0$ and its color is $k$. Since the choice of $g_{v_0}$ was arbitrary,
$H_0$ contains $N$ edge-disjoint copies of $H$.

The rest of the proof is the same as the last three paragraphs of the
proof of Theorem 3. □

As a final remark, we briefly discuss the condition of graph
representability. The key point in the proof of Theorem 3 is the
correspondence between copies of $H$ and solutions of the system: every
copy of $H$ in the constructed graph $H_0$ yields a solution of the system
and every solution gives rise to $N$ edge-disjoint copies of $H$. This
correspondence can be broken if the system is not graph representable in
the sense we have defined. For instance, in the example from Figure 2, it
is possible to express only $2C$ as an integer combination of the base, and
thus the stated correspondence need not exist for groups with elements of
order two.